What is claimed is: 

1. A method for transcribing speech of a plurality of speakers, comprising: 
providing said speech to a plurality of speech decoders, each of said 
decoders using a speaker model for one of said speakers and generating a confidence 
score for each decoded output; and 

selecting a decoded output based on said confidence score. 
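
By way of illustration only, the selecting step of claim 1 may be sketched as follows. The sketch assumes that a higher confidence score indicates a better decoded output and picks the maximum; the names DecodedOutput and select_output, and the example scores, are hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass


@dataclass
class DecodedOutput:
    speaker: str       # speaker whose model produced this output
    text: str          # decoded word string
    confidence: float  # confidence score (assumption: higher is better)


def select_output(outputs):
    """Return the decoded output with the highest confidence score."""
    if not outputs:
        raise ValueError("no decoded outputs to select from")
    return max(outputs, key=lambda o: o.confidence)


# Hypothetical outputs from two decoders run on the same speech.
outputs = [
    DecodedOutput("speaker_a", "please open the file", 0.82),
    DecodedOutput("speaker_b", "fleas open the vile", 0.41),
]
best = select_output(outputs)
```

Selecting the maximum is only one way of selecting "based on" the confidence score; other selection rules are equally consistent with the claim language.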

2. The method of claim 1, further comprising the step of aligning each of said 
decoded outputs in time. 

3. The method of claim 1, wherein one or more of said speech decoders are 
on a remote server. 

4. The method of claim 1, further comprising the step of presenting said 
selected decoded output to a user. 

5. The method of claim 1, further comprising the step of manually selecting 
an alternate decoded output if said assigned output is incorrect. 

6. The method of claim 5, further comprising the step of adapting said 
selecting step based on said manual selection. 

7. The method of claim 1, further comprising the step of presenting several 
decoded outputs to a user with an indication of said corresponding confidence score. 

8. The method of claim 1, further comprising the step of presenting said 
decoded output as a string of words if said corresponding confidence score exceeds a 
certain threshold and as a string of phones if said corresponding confidence score is 
below a certain threshold. 
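
The presentation rule of claim 8 may be sketched, under stated assumptions, as a simple threshold test. The function name render_output, the default threshold of 0.5, and the phone symbols are hypothetical; the claim does not say how a score exactly equal to the threshold is handled, and this sketch arbitrarily treats it as below the threshold.

```python
def render_output(words, phones, confidence, threshold=0.5):
    """Present words when confidence exceeds the threshold, else phones.

    Assumption: a score equal to the threshold is presented as phones.
    """
    if confidence > threshold:
        return " ".join(words)
    return " ".join(phones)


# Hypothetical decoded output in both word and phone form.
words = ["hello"]
phones = ["HH", "AH", "L", "OW"]
high = render_output(words, phones, confidence=0.9)
low = render_output(words, phones, confidence=0.2)
```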

9. The method of claim 1, further comprising the step of presenting said 
decoded output as a string of words for the decoded output having the highest confidence 
score and as phones or syllables for all other decoded outputs. 

10. The method of claim 1, wherein said selecting step further comprises the 
step of determining if a decoded output includes an isolated word from a second speaker 
in a string of words from a first speaker. 
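
One possible way to make the determination of claim 10 is sketched below, assuming each decoded word has already been labeled with the speaker whose decoder was selected for it. The function name isolated_words and the labels are hypothetical, and a word is treated as isolated only when both of its neighbors carry the same, different speaker label.

```python
def isolated_words(labels):
    """Return indices of single words attributed to a different speaker
    than both neighboring words, where the neighbors agree."""
    hits = []
    for i in range(1, len(labels) - 1):
        # Neighbors share a speaker, and the middle word does not.
        if labels[i - 1] == labels[i + 1] and labels[i] != labels[i - 1]:
            hits.append(i)
    return hits


# Hypothetical per-word speaker labels for a five-word string.
flagged = isolated_words(["A", "A", "B", "A", "A"])
```

What the selecting step does with a flagged word (for example, reassigning it to the surrounding speaker) is not specified in the claim and is left out of the sketch.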

11. A method for transcribing speech of a plurality of speakers, comprising: 
providing said speech to a speaker independent speech recognition system 
and a speaker specific speech recognition system; and 

decoding said speech using said speaker independent speech recognition 
system whenever the identity of the current speaker is unknown. 

12. The method of claim 11, wherein said decoding step continues until a 
speaker identification system identifies an unknown speaker. 

13. The method of claim 11, wherein one or more of said speaker independent 
speech recognition system and said speaker specific speech recognition system are on a 
remote server. 

14. The method of claim 11, further comprising the step of presenting said 
selected decoded output to a user. 

15. A method for transcribing speech of a plurality of speakers, comprising: 
providing said speech to a speaker independent speech recognition system 
and a speaker specific speech recognition system; and 

decoding said speech using said speaker specific speech recognition 
system with a speaker model for an identified speaker until there is a speaker change. 

16. The method of claim 15, further comprising the step of decoding said 
speech using a speaker independent speech recognition system until the identity of a 
speaker is determined and the appropriate speaker model is loaded. 
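
The switching behavior recited in claims 15 and 16 may be sketched as follows, assuming the speech has already been divided into segments and that a separate speaker identification step supplies either a speaker identity or None for each segment. The function name plan_decoders, the segment identifiers, and the speaker name are hypothetical.

```python
def plan_decoders(segments, loaded_models):
    """Choose a recognizer for each segment.

    segments: list of (segment_id, speaker_id or None) pairs.
    loaded_models: set of speaker ids whose speaker model is loaded.
    """
    plan = []
    for segment_id, speaker in segments:
        if speaker is not None and speaker in loaded_models:
            # Identity known and model loaded: speaker specific decoding.
            plan.append((segment_id, "speaker_specific:" + speaker))
        else:
            # Identity unknown or model not loaded: speaker independent.
            plan.append((segment_id, "speaker_independent"))
    return plan


# Hypothetical session: unknown speaker, then "alice", then a change
# to another speaker who has not yet been identified.
plan = plan_decoders(
    [("s1", None), ("s2", "alice"), ("s3", "alice"), ("s4", None)],
    {"alice"},
)
```

Here a speaker change appears as a change in the identity supplied for a segment; how the change is detected is outside the sketch.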

17. The method of claim 15, wherein one or more of said speaker independent 
speech recognition system and said speaker specific speech recognition system are on a 
remote server. 

18. The method of claim 15, further comprising the step of presenting said 
selected decoded output to a user. 

19. A system for transcribing speech of a plurality of speakers, comprising: 
a memory that stores computer-readable code; and 

a processor operatively coupled to said memory, said processor configured 
to implement said computer-readable code, said computer-readable code configured to: 

provide said speech to a plurality of speech decoders, each of said 
decoders using a speaker model for one of said speakers and generating a confidence 
score for each decoded output; and 

select a decoded output having a highest confidence score. 

20. The system of claim 19, wherein said processor is further configured to 
align each of said decoded outputs in time. 

21. The system of claim 19, wherein one or more of said speech decoders are 
on a remote server. 

22. The system of claim 19, wherein said processor is further configured to 
present said selected decoded output to a user. 

23. A system for transcribing speech of a plurality of speakers, comprising: 
a memory that stores computer-readable code; and 

a processor operatively coupled to said memory, said processor configured 
to implement said computer-readable code, said computer-readable code configured to: 

provide said speech to a speaker independent speech recognition system 
and a speaker specific speech recognition system; and 

decode said speech using said speaker independent speech recognition 
system whenever the identity of the current speaker is unknown. 

24. The system of claim 23, wherein said processor performs said decoding 
until a speaker identification system identifies an unknown speaker. 

25. The system of claim 23, wherein one or more of said speaker independent 
speech recognition system and said speaker specific speech recognition system are on a 
remote server. 

26. The system of claim 23, wherein said processor is further configured to 
present said selected decoded output to a user. 

27. A system for transcribing speech of a plurality of speakers, comprising: 
a memory that stores computer-readable code; and 

a processor operatively coupled to said memory, said processor configured 
to implement said computer-readable code, said computer-readable code configured to: 

provide said speech to a speaker independent speech recognition system 
and a speaker specific speech recognition system; and 

decode said speech using said speaker specific speech recognition system 
with a speaker model for an identified speaker until there is a speaker change. 

28. The system of claim 27, wherein said processor is further configured to 
decode said speech using a speaker independent speech recognition system until the 
identity of a speaker is determined and the appropriate speaker model is loaded. 

29. The system of claim 27, wherein one or more of said speaker independent 
speech recognition system and said speaker specific speech recognition system are on a 
remote server. 

30. The system of claim 27, wherein said processor is further configured to 
present said selected decoded output to a user. 

31. An article of manufacture for transcribing speech of a plurality of 
speakers, comprising: 

a computer readable medium having computer readable code means 
embodied thereon, said computer readable program code means comprising: 

a step to provide said speech to a plurality of speech decoders, each of said 
decoders using a speaker model for one of said speakers and generating a confidence 
score for each decoded output; and 

a step to select a decoded output having a highest confidence score. 

32. An article of manufacture for transcribing speech of a plurality of 
speakers, comprising: 

a computer readable medium having computer readable code means 
embodied thereon, said computer readable program code means comprising: 

a step to provide said speech to a speaker independent speech recognition 
system and a speaker specific speech recognition system; and 

a step to decode said speech using said speaker independent speech 
recognition system whenever the identity of the current speaker is unknown. 



33. An article of manufacture for transcribing speech of a plurality of 
speakers, comprising: 

a computer readable medium having computer readable code means 
embodied thereon, said computer readable program code means comprising: 

a step to provide said speech to a speaker independent speech recognition 
system and a speaker specific speech recognition system; and 

a step to decode said speech using said speaker specific speech recognition 
system with a speaker model for an identified speaker until there is a speaker change. 